Extrapolation-based tuning parameters selection in massive data analysis
Abstract
Many statistical modeling procedures involve one or more tuning parameters to control the model complexity. These can be the bandwidth in the kernel smoothing method for nonparametric regression and density estimation, or the regularization parameter for feature selection in high-dimensional modeling. Tuning parameter selection plays a critical role in machine learning. For massive data analysis, commonly used methods such as grid-point search with information criteria become prohibitively costly in computation. Their feasibility is questionable even with modern parallel computing platforms. This paper aims to develop a fast algorithm to efficiently approximate the best tuning parameters. The algorithm entails (a) assuming a parametric model to describe the trend between the best tuning parameters and sample sizes, (b) establishing the trend via fitting the model with subsampling data, and (c) extrapolating this trend to the case of huge sample size. To determine the subsampling sizes to be taken, we derive optimal designs under settings that allow a constraint on the budget of total computational cost. We show that the proposed designs possess an asymptotic optimality property. Our numerical studies demonstrate that the proposed method, with a simple two-parameter polynomial model, performs almost equivalently to the procedure using the full data set in several different settings, while it achieves a significant reduction in computation time and storage.
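Steps (a) to (c) of the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the two-parameter model, so the sketch assumes the power-law trend `param(n) = a * n**b` fitted by least squares on the log scale. The function names, the `select_fn` callback (whatever procedure picks the best tuning parameter on one subsample), and the subsample sizes are all hypothetical. The optimal design for choosing subsample sizes is not implemented here; the sizes are supplied by the caller.

```python
import numpy as np


def fit_trend(sizes, best_params):
    """Fit the assumed trend param(n) = a * n**b on the log-log scale.

    Returns the two fitted parameters (a, b).
    """
    log_n = np.log(np.asarray(sizes, dtype=float))
    log_p = np.log(np.asarray(best_params, dtype=float))
    # np.polyfit returns coefficients from highest degree down: slope, intercept.
    b, log_a = np.polyfit(log_n, log_p, deg=1)
    return float(np.exp(log_a)), float(b)


def extrapolate(a, b, n_full):
    """Evaluate the fitted trend at the full sample size."""
    return a * float(n_full) ** b


def select_by_extrapolation(data, sizes, select_fn, seed=0):
    """Approximate the best tuning parameter for the full data set.

    data      : array of observations (first axis indexes observations)
    sizes     : subsample sizes, each much smaller than len(data)
    select_fn : hypothetical callback returning the best (positive)
                tuning parameter for one subsample
    """
    data = np.asarray(data)
    rng = np.random.default_rng(seed)
    n_full = len(data)
    best = []
    for m in sizes:
        # (b) run the costly selection only on a subsample of size m
        idx = rng.choice(n_full, size=m, replace=False)
        best.append(select_fn(data[idx]))
    # (a) + (b) fit the assumed parametric trend to the subsample results
    a, b = fit_trend(sizes, best)
    # (c) extrapolate the trend to the full sample size
    return extrapolate(a, b, n_full)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=200_000)

    # Toy stand-in for an expensive selector: a rule-of-thumb bandwidth.
    def toy_selector(sub):
        return 1.06 * sub.std() * len(sub) ** (-0.2)

    print(select_by_extrapolation(x, [1_000, 2_000, 4_000, 8_000], toy_selector))
```

In practice `select_fn` would be the expensive step, such as a grid-point search with an information criterion, so the saving comes from running it only on subsamples that are far smaller than the full data set.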
Similar resources
Robust Principal Component Analysis with Adaptive Selection for Tuning Parameters
The present paper discusses robustness against outliers in a principal component analysis (PCA). We propose a class of procedures for PCA based on the minimum psi principle, which unifies various approaches, including the classical procedure and recently proposed procedures. The reweighted matrix algorithm for off-line data and the gradient algorithm for on-line data are both investigated with ...
Consistent selection of tuning parameters via variable selection stability
Penalized regression models are popularly used in high-dimensional data analysis to conduct variable selection and model fitting simultaneously. Whereas success has been widely reported in literature, their performances largely depend on the tuning parameters that balance the trade-off between model fitting and model sparsity. Existing tuning criteria mainly follow the route of minimizing the e...
Supplier Selection in Supply Chain Management by Data Envelopment Analysis
Nowadays, managing a supply chain has become one of the fundamentals of business processes. In doing so, investigating and analyzing each of the processes and selecting the best of each is an important challenge for strategic managers. In this paper, the Data Envelopment Analysis (DEA) technique is used and a model is provided for selecting the best suppliers with flexible inpu...
A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters
Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...
Journal
Journal title: Zhongguo kexue
Year: 2021
ISSN: 1006-9267
DOI: https://doi.org/10.1360/scm-2020-0622